 bridge function


Deep Proxy Causal Learning and its Application to Confounded Bandit Policy Evaluation

Neural Information Processing Systems

Proxy causal learning (PCL) is a method for estimating the causal effect of treatments on outcomes in the presence of unobserved confounding, using proxies (structured side information) for the confounder.
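To make the role of the proxies concrete, here is a minimal sketch of a linear two-stage regression on simulated data. It is a toy illustration under strong assumptions (linear-Gaussian model, one treatment-side proxy `z`, one outcome-side proxy `w`), not the deep method of the paper above. All function names, variable names, and coefficient values are hypothetical.

```python
import numpy as np


def simulate(n, seed=0, beta=1.0, gamma=2.0):
    """Toy linear data with a hidden confounder u (assumed model, not from the paper)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                          # unobserved confounder
    z = u + rng.normal(size=n)                      # treatment-side proxy
    w = u + rng.normal(size=n)                      # outcome-side proxy
    a = u + rng.normal(size=n)                      # treatment, confounded by u
    y = beta * a + gamma * u + rng.normal(size=n)   # outcome; true effect is beta
    return a, z, w, y


def _ols(features, target):
    """Least-squares coefficients; the intercept is at index 0."""
    design = np.column_stack([np.ones(len(target))] + list(features))
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def naive_effect(a, y):
    """Slope of y on a alone; biased because u drives both."""
    return _ols([a], y)[1]


def two_stage_proxy_effect(a, z, w, y):
    """Two-stage estimate of the treatment effect using the proxies."""
    # Stage 1: predict the outcome proxy w from the treatment a and the proxy z.
    c = _ols([a, z], w)
    w_hat = c[0] + c[1] * a + c[2] * z
    # Stage 2: regress y on a and the predicted proxy; the coefficient on a
    # is the effect estimate in this linear setting.
    return _ols([a, w_hat], y)[1]


if __name__ == "__main__":
    a, z, w, y = simulate(20000, seed=0)
    print("naive estimate:    ", naive_effect(a, y))
    print("two-stage estimate:", two_stage_proxy_effect(a, z, w, y))
```

In this simulated model, E[w | a, z] equals E[u | a, z], so the stage-one prediction stands in for the hidden confounder in stage two. The naive slope absorbs part of the confounder's effect, while the two-stage estimate should land close to the assumed `beta`.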


On Multiple Robustness of Proximal Dynamic Treatment Regimes

Gao, Yuanshan, Bai, Yang, Cui, Yifan

arXiv.org Machine Learning

Dynamic treatment regimes are sequential decision rules that adapt treatment according to individual time-varying characteristics and outcomes to achieve optimal effects, with applications in precision medicine, personalized recommendations, and dynamic marketing. Estimating optimal dynamic treatment regimes via sequential randomized trials may face cost and ethical hurdles, often necessitating the use of historical observational data. In this work, we use the proximal causal inference framework to learn optimal dynamic treatment regimes when the unconfoundedness assumption fails. Our contributions are four-fold: (i) we propose three nonparametric identification methods for optimal dynamic treatment regimes; (ii) we establish the semiparametric efficiency bound for the value function of a given regime; (iii) we propose a (K+1)-robust method for learning optimal dynamic treatment regimes, where K is the number of stages; (iv) as a by-product for marginal structural models, we establish identification and estimation of counterfactual means under a static regime. Numerical experiments validate the efficiency and multiple robustness of our proposed methods.


Coupling Generative Modeling and an Autoencoder with the Causal Bridge

Meng, Ruolin, Chung, Ming-Yu, Brahma, Dhanajit, Henao, Ricardo, Carin, Lawrence

arXiv.org Machine Learning

We consider inferring the causal effect of a treatment (intervention) on an outcome of interest in situations where there is potentially an unobserved confounder influencing both the treatment and the outcome. This is achievable by assuming access to two separate sets of control (proxy) measurements associated with treatment and outcomes, which are used to estimate treatment effects through a function termed the causal bridge (CB). We present a new theoretical perspective, associated assumptions for when estimating treatment effects with the CB is feasible, and a bound on the average error of the treatment effect when the CB assumptions are violated. From this new perspective, we then demonstrate how coupling the CB with an autoencoder architecture allows for the sharing of statistical strength between observed quantities (proxies, treatment, and outcomes), thus improving the quality of the CB estimates. Experiments on synthetic and real-world data demonstrate the effectiveness of the proposed approach in relation to the state-of-the-art methodology for proxy measurements.




Density Ratio-Free Doubly Robust Proxy Causal Learning

Bozkurt, Bariscan, Zenati, Houssam, Meunier, Dimitri, Xu, Liyuan, Gretton, Arthur

arXiv.org Machine Learning

We study the problem of causal function estimation in the Proxy Causal Learning (PCL) framework, where confounders are not observed but proxies for the confounders are available. Two main approaches have been proposed: outcome bridge-based and treatment bridge-based methods. In this work, we propose two kernel-based doubly robust estimators that combine the strengths of both approaches, and naturally handle continuous and high-dimensional variables. Our identification strategy builds on a recent density ratio-free method for treatment bridge-based PCL; furthermore, in contrast to previous approaches, it does not require indicator functions or kernel smoothing over the treatment variable. These properties make it especially well-suited for continuous or high-dimensional treatments. By using kernel mean embeddings, we have closed-form solutions and strong consistency guarantees. Our estimators outperform existing methods on PCL benchmarks, including a prior doubly robust method that requires both kernel smoothing and density ratio estimation.


Proximal Inference on Population Intervention Indirect Effect

Bai, Yang, Cui, Yifan, Sun, Baoluo

arXiv.org Machine Learning

Additionally, experiments have shown that depersonalization symptoms can arise as a reaction to alcohol consumption (Raimo et al., 1999), and they are increasingly recognized as a significant prognostic factor in the course of depression (Michal et al., 2024). Despite these findings, little research has explored the mediating role of depersonalization symptoms in the causal pathway from alcohol consumption to depression. In this paper, we propose a methodological framework to evaluate the indirect effect of alcohol consumption on depression, with depersonalization acting as a mediator. To ground our analysis, we use data from a cross-sectional survey conducted during the COVID-19 pandemic by Domínguez-Espinosa et al. (2023) as a running example. In observational studies, the population average causal effect (ACE) and the natural indirect effect (NIE) are the most commonly used measures of total and mediation effects, respectively, to compare the outcomes of different intervention policies. For instance, in our running example, these two measures compare the depression outcomes between individuals engaging in hazardous versus non-hazardous alcohol consumption. However, clinical practice imposes ethical constraints, as healthcare professionals would not prescribe harmful levels of alcohol consumption. As a result, hypothetical interventions involving dangerous exposure levels are unrealistic. To address this situation with potentially harmful exposure, Hubbard and Van der Laan (2008) propose the population intervention effect (PIE), which contrasts outcomes between the natural population and a hypothetical population where no one is exposed to the harmful exposure level.


Density Ratio-based Proxy Causal Learning Without Density Ratios

Bozkurt, Bariscan, Deaner, Ben, Meunier, Dimitri, Xu, Liyuan, Gretton, Arthur

arXiv.org Artificial Intelligence

We address the setting of Proxy Causal Learning (PCL), which has the goal of estimating causal effects from observed data in the presence of hidden confounding. Proxy methods accomplish this task using two proxy variables related to the latent confounder: a treatment proxy (related to the treatment) and an outcome proxy (related to the outcome). Two approaches have been proposed to perform causal effect estimation given proxy variables; however only one of these has found mainstream acceptance, since the other was understood to require density ratio estimation - a challenging task in high dimensions. In the present work, we propose a practical and effective implementation of the second approach, which bypasses explicit density ratio estimation and is suitable for continuous and high-dimensional treatments. We employ kernel ridge regression to derive estimators, resulting in simple closed-form solutions for dose-response and conditional dose-response curves, along with consistency guarantees. Our methods empirically demonstrate superior or comparable performance to existing frameworks on synthetic and real-world datasets.
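The closed-form character of kernel ridge regression, which the abstract above relies on, can be illustrated with a short sketch. This is plain kernel ridge regression on a one-dimensional toy problem, not the paper's dose-response estimator. The RBF kernel, the `K + n * lam * I` regularization convention, and all function and parameter names are assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(x1, x2, lengthscale=1.0):
    """Gaussian (RBF) kernel matrix between two 1-D arrays of inputs."""
    sq_dist = (x1[:, None] - x2[None, :]) ** 2
    return np.exp(-sq_dist / (2.0 * lengthscale ** 2))


def krr_fit(x, y, lam=1e-3, lengthscale=1.0):
    """Closed-form ridge weights: alpha = (K + n * lam * I)^{-1} y."""
    n = len(x)
    gram = rbf_kernel(x, x, lengthscale)
    return np.linalg.solve(gram + n * lam * np.eye(n), y)


def krr_predict(x_train, alpha, x_new, lengthscale=1.0):
    """Prediction f(x) = sum_i alpha_i * k(x, x_i)."""
    return rbf_kernel(x_new, x_train, lengthscale) @ alpha


if __name__ == "__main__":
    x = np.linspace(-3.0, 3.0, 200)
    y = np.sin(x)
    alpha = krr_fit(x, y, lam=1e-6)
    x_new = np.linspace(-2.0, 2.0, 5)
    print(krr_predict(x, alpha, x_new))  # should track np.sin(x_new) closely
```

Fitting reduces to a single linear solve against the regularized Gram matrix, which is why estimators built from kernel ridge regression steps tend to admit closed-form expressions.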


Proxy Methods for Domain Adaptation

Tsai, Katherine, Pfohl, Stephen R., Salaudeen, Olawale, Chiou, Nicole, Kusner, Matt J., D'Amour, Alexander, Koyejo, Sanmi, Gretton, Arthur

arXiv.org Machine Learning

We study the problem of domain adaptation under distribution shift, where the shift is due to a change in the distribution of an unobserved, latent variable that confounds both the covariates and the labels. In this setting, neither the covariate shift nor the label shift assumptions apply. Our approach to adaptation employs proximal causal learning, a technique for estimating causal effects in settings where proxies of unobserved confounders are available. We demonstrate that proxy variables allow for adaptation to distribution shift without explicitly recovering or modeling latent variables. We consider two settings, (i) Concept Bottleneck: an additional "concept" variable is observed that mediates the relationship between the covariates and labels; (ii) Multi-domain: training data from multiple source domains is available, where each source domain exhibits a different distribution over the latent confounder. We develop a two-stage kernel estimation approach to adapt to complex distribution shifts in both settings. In our experiments, we show that our approach outperforms other methods, notably those which explicitly recover the latent confounder.


A Policy Gradient Method for Confounded POMDPs

Hong, Mao, Qi, Zhengling, Xu, Yanxun

arXiv.org Machine Learning

In this paper, we propose a policy gradient method for confounded partially observable Markov decision processes (POMDPs) with continuous state and observation spaces in the offline setting. We first establish a novel identification result to non-parametrically estimate any history-dependent policy gradient under POMDPs using the offline data. The identification enables us to solve a sequence of conditional moment restrictions and adopt the min-max learning procedure with general function approximation for estimating the policy gradient. We then provide a finite-sample non-asymptotic bound for estimating the gradient uniformly over a pre-specified policy class in terms of the sample size, length of horizon, concentrability coefficient and the measure of ill-posedness in solving the conditional moment restrictions. Lastly, by deploying the proposed gradient estimation in the gradient ascent algorithm, we show the global convergence of the proposed algorithm in finding the history-dependent optimal policy under some technical conditions. To the best of our knowledge, this is the first work studying the policy gradient method for POMDPs under the offline setting.